Ct 854/change overzealous connection closing #428
Conversation
…ous_connection_closing
Default still to release.
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-snowflake contributing guide.
@@ -151,6 +152,18 @@ def auth_args(self):
result["client_store_temporary_credential"] = True
# enable mfa token cache for linux
result["client_request_mfa_token"] = True

# Warning: This profile configuration can result in specific threads, even just one,
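As a rough illustration of what the hunk above is doing -- toggling the connector's credential caches so reused sessions do not re-prompt for MFA -- here is a simplified, hypothetical sketch. The names `auth_args` and the flag keys come from the diff; the standalone function shape and the assumption that the `username_password_mfa` authenticator is what gates these flags are mine, not dbt-snowflake's actual implementation (the real logic is a method on the credentials class and handles many more options):

```python
def auth_args(authenticator: str) -> dict:
    """Build connector session arguments keyed off the chosen authenticator.

    Simplified stand-in for dbt-snowflake's auth_args method.
    """
    result = {}
    if authenticator == "username_password_mfa":
        # cache temporary credentials so each new connection can skip re-auth
        result["client_store_temporary_credential"] = True
        # enable mfa token cache for linux
        result["client_request_mfa_token"] = True
    return result

# e.g. auth_args("username_password_mfa") enables both cache flags,
# while other authenticators leave the dict empty
```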
Context/Alternatives in PR description.
My proposed way of handling what appears to be a rare bug. Warn people against it. Don't encourage it.
Oh, great pick up! Wouldn't have wanted this one to disappear. All seems okay to me!
🎉!!! This is great, what a fantastic solution. Could the issue with connection hanging be related to latency? My gut feeling is that something funky is going on with timeouts, or that Snowflake is having intermittent issues. I'll see if I can run tracing with this and just keep repeating the runs until it happens. I'm ~250ms away, so we'll see how often it happens. To test hanging connections, I'll try the following:
If this works, I'll try with larger models to see if that triggers it. I'm sure there is a correlation with either shorter runs or longer ones. Or maybe it's just random 🤷.
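The repeat-until-it-reproduces approach described above could be sketched as a small timing harness. Everything here is hypothetical scaffolding (it assumes a working dbt project in the current directory and the `dbt` CLI on PATH); the outlier check is the part that would flag a run where a connection hung open:

```python
import statistics
import subprocess
import time

def outliers(durations, factor=3.0):
    """Return indices of runs that took far longer than the median run."""
    med = statistics.median(durations)
    return [i for i, d in enumerate(durations) if d > factor * med]

def time_runs(cmd, repeats):
    """Time repeated invocations of a command, e.g. ["dbt", "run"]."""
    durations = []
    for _ in range(repeats):
        start = time.monotonic()
        subprocess.run(cmd, check=True)
        durations.append(time.monotonic() - start)
    return durations

if __name__ == "__main__":
    # Hypothetical usage: repeat until a hang shows up as a timing outlier.
    runs = time_runs(["dbt", "run"], repeats=20)
    print(outliers(runs))
```

For example, a batch of runs taking roughly 2 seconds each with one run taking 8 (as in the bug described later in this PR) would be flagged by `outliers`.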
I wonder if this is related to the issue that we're seeing in local unit tests, and in the scenario where we don't specify the number of workers for unit tests.
@joshuataylor After digging into this more, my only real concern was that this behavior could result in financial cost to users without their knowing -- that appears NOT to be the case, both in the docs and in anecdotes from users. I believe the bug I ran into -- where execution on one thread hung open for an unusual amount of time -- was related to intermittent Snowflake bottlenecks rather than anything caused by leaving open threads in the pool.
Revised the code to meet some design specs, and confirmed the following after talking to Snowflake folks:
This one was an adventure, and I thank everyone for their patience and insight in getting us to these answers!
lgtm. Thanks for digging in and investigating this to understand + confirm the underlying behavior.
LGTM great catch on flip!
resolves #201
closes #203
Description
Each time dbt talks to a Snowflake warehouse, connections are established and reestablished repeatedly. That's a lot of time wasted on this step. Why not keep the already-opened connection(s) alive for the duration of the execution? That's what this PR does.
Teasing out the existing behavior
Set threads to 1 in your profiles.yml, without the reuse_connection param, and watch entries arrive in
New behavior
Just add a reuse_connection param to your Snowflake connection in profiles.yml, and a dbt run over Jaffle Shop goes from 7 unique login entries in Snowflake to 1. The time saved is upwards of 2/3. Pretty big deal!
Why this?
@joshuataylor (❤️) gave us a proof of concept fix in https://github.com/dbt-labs/dbt-snowflake/pull/203. I simplified the code down and added lines to enable users to plug and play with profiles.yml. I also added unit testing for this.
I was worried about thread safety, since
self.lock is a bit of obtuse automagic. Eventually, I realized that instead of reinventing the wheel, I could just super() up to SQLConnectionManager from dbt-core and its many safety checks. Otherwise, we're kind of at the mercy of largely undocumented dbt-snowflake-connector methods.
A bug
Jeremy warns of this here. I ran into something like this once without looking for it: every model took under 2 seconds, and the last thread hung open for 8. Trials did not trigger this for any combination of client_session_keep_alive and request_connection except for the former as True and the latter as False, respectively.
We could try to manually close threads, but setting an explicit timeout seems sketchy to me in the opposite direction. I thought about closing any unused threads, but how would you do something like that in this Python runtime? At what point do you have it begin running over open threads and deciding which to close prematurely? If anyone has an idea, by all means, I'm super game. I'm just unsure of how to do it.
Things I'm uncomfy with
Technically, with this solution, the connection isn't ever closed. This shows up in the above "bug": it's just cleaned up when the process exits. Acceptable or not, in cases besides the bug above?
Update: We close all thread connections in the adapter context today, before trying to tear down the dbt runtime, so this has been treated.
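The lifecycle described above -- reuse one connection per thread for the duration of the run, then close everything before teardown -- can be sketched as a toy model. The class below is a hypothetical illustration with a fake connection object, not dbt-snowflake's actual connection manager; it just demonstrates the per-thread caching and the explicit cleanup pass:

```python
import threading

class FakeConnection:
    """Stand-in for a snowflake.connector connection (hypothetical)."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

class ReusingConnectionManager:
    """Cache one connection per thread; close them all at teardown."""
    def __init__(self, connect=FakeConnection):
        self._connect = connect
        self._lock = threading.Lock()  # guards the shared cache
        self._by_thread = {}           # thread ident -> live connection
        self.created = []              # every connection ever opened
        self.opens = 0                 # count of real connect calls

    def acquire(self):
        key = threading.get_ident()
        with self._lock:
            conn = self._by_thread.get(key)
            if conn is None or conn.closed:
                conn = self._connect()
                self.created.append(conn)
                self.opens += 1
                self._by_thread[key] = conn
            return conn

    def cleanup_all(self):
        """Close every cached connection before the runtime tears down."""
        with self._lock:
            for conn in self._by_thread.values():
                if not conn.closed:
                    conn.close()
            self._by_thread.clear()

mgr = ReusingConnectionManager()
barrier = threading.Barrier(2)  # keep both workers alive concurrently

def run_models():
    barrier.wait()
    for _ in range(3):  # several "queries" on one worker thread
        mgr.acquire()   # same connection reused each time

workers = [threading.Thread(target=run_models) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
mgr.cleanup_all()
print(mgr.opens)  # 2: one connection per worker thread, not per query
```

Without the reuse cache, the same workload would open six connections (one per query); with it, each worker thread logs in once, which mirrors the 7-logins-to-1 observation in the description.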
Did you document this?
That I did. Corresponding docs PR up over yonder.
Checklist
I have run changie new to create a changelog entry